Speech synthesis using warped linear prediction and neural networks

Authors

  • Matti Karjalainen
  • Toomas Altosaar
  • Martti Vainio
Abstract

A text-to-speech synthesis technique based on warped linear prediction (WLP) and neural networks is presented for high-quality, individual-sounding synthetic speech. Warped linear prediction is used as a speech production model that offers wide audio bandwidth while keeping the control parameter data highly compressed. An excitation codebook, inverse filtered from a target speaker’s voice, is applied to obtain individual tone quality. A set of neural networks, each specialized to yield synthesis control parameters from phonemic input in specific contexts, generates the detailed parametric controls of WLP. Neural nets are also used successfully to compute the prosodic parameters. We have applied this approach in prototyping highly improved text-to-speech synthesis for the Finnish language.
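To make the WLP idea in the abstract concrete, here is a minimal sketch of warped linear prediction analysis in pure Python. It is not the authors' implementation. It uses the common textbook formulation in which each unit delay is replaced by a first-order all-pass section, followed by the autocorrelation method with a Levinson-Durbin recursion. The function names (`allpass`, `warped_autocorr`, `levinson`, `wlp`, `warp_freq`), the test signal, the order 8, and the warping coefficient `lam = 0.57` are all illustrative assumptions and do not come from the paper.

```python
import math

def allpass(x, lam):
    """Filter x with the first-order all-pass D(z) = (z^-1 - lam) / (1 - lam*z^-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for s in x:
        out = -lam * s + x_prev + lam * y_prev
        y.append(out)
        x_prev, y_prev = s, out
    return y

def warped_autocorr(x, order, lam):
    """Warped autocorrelation r[k] = sum_n x[n] * (D^k x)[n] for k = 0..order."""
    r, xk = [], list(x)
    for _ in range(order + 1):
        r.append(sum(a * b for a, b in zip(x, xk)))
        xk = allpass(xk, lam)  # one more all-pass section per lag
    return r

def levinson(r, order):
    """Levinson-Durbin recursion; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def wlp(x, order, lam):
    """Warped LP coefficients of frame x; lam = 0 gives ordinary LP."""
    return levinson(warped_autocorr(x, order, lam), order)

def warp_freq(w, lam):
    """Map frequency w (radians/sample) to the warped frequency axis."""
    return w + 2.0 * math.atan2(lam * math.sin(w), 1.0 - lam * math.cos(w))

# Hypothetical test frame: two sinusoids (not speech data from the paper).
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(256)]
coeffs, residual_energy = wlp(frame, order=8, lam=0.57)
```

With `lam = 0` the all-pass section reduces to a plain unit delay, so `wlp` falls back to conventional linear prediction. A positive `lam` stretches the low-frequency region, as `warp_freq` shows, so a fixed number of coefficients spends more of its resolution there.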

Similar resources

Wideband Parametric Speech Synthesis Using Warped Linear Prediction

This paper studies the use of warped linear prediction (WLP) for wideband parametric speech synthesis. As the sampling frequency is increased from the usual 16 kHz, the linear frequency resolution of conventional linear prediction (LP) cannot efficiently model the speech spectrum. By using frequency warping that perceptually weights the most important formant information, spectral models with bette...

Prediction of Gain in LD-CELP Using Hybrid Genetic/PSO-Neural Models

In this paper, the gain in the LD-CELP speech coding algorithm is predicted using three neural models equipped with genetic and particle swarm optimization (PSO) algorithms, which optimize the structure and parameters of the neural networks. Elman, multi-layer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models. The optimized number of nodes in the first and second hidden layers of El...

Generalized source-filter structures for speech synthesis

In this paper we discuss various digital filter principles as models for synthetic speech generation. Warped linear prediction (WLP) and frequency-warped filters have been introduced earlier as methods to reduce the filter order in high-quality wideband speech synthesis. In addition to analyzing WLP and frequency-warped filters, we introduce new related structures and techniques for arbitrary f...

Investigating Financial Crisis Prediction Power using Neural Network and Non-Linear Genetic Algorithm

Bankruptcy is an event with strong impacts on management, shareholders, employees, creditors, customers and other stakeholders, and it challenges the country both socially and economically. Therefore, correct prediction of bankruptcy is of high importance in the financial world. This research intends to investigate financial crisis prediction power using models based on Neural Network...

Publication date: 1998